Toward Metric Indexes for Incremental Insertion and Querying
Abstract
In this work we explore the use of metric index structures, which accelerate nearest neighbor queries, in the scenario where insertions and queries must be interleaved during deployment. This use case is inspired by a real-life need in malware analysis triage and is surprisingly understudied. Existing literature tends to focus only on final query efficiency, often does not support incremental insertion, or does not support arbitrary distance metrics. We modify and improve three algorithms to support our scenario of incremental insertion and querying with arbitrary metrics, and evaluate them on multiple datasets and distance metrics while varying k, the desired number of nearest neighbors. In doing so we determine that our improved Vantage-Point tree of Minimum-Variance performs best for this scenario.
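To make the setting concrete, the following is a minimal sketch of a metric index that accepts interleaved insertions and k-nearest-neighbor queries under an arbitrary distance function. It is a plain bucketed vantage-point tree written for illustration, not the improved algorithms evaluated in the paper. The names `IncrementalVPTree`, `leaf_size`, and `manhattan` are hypothetical, and the split rule (first bucket element as vantage point, lower median distance as radius) is an assumption chosen for brevity. The only property required of `dist` is that it satisfies the triangle inequality, which is what justifies the pruning tests in `knn`.

```python
import heapq
import itertools
import math
from typing import Any, Callable, List, Optional, Tuple

Metric = Callable[[Any, Any], float]

class _Node:
    """A leaf holds a bucket of points; an internal node holds a vantage point."""
    def __init__(self) -> None:
        self.bucket: Optional[List[Any]] = []   # set to None once the node is split
        self.vp: Any = None
        self.radius: float = 0.0
        self.inside: Optional["_Node"] = None   # points with dist(vp, p) <= radius
        self.outside: Optional["_Node"] = None  # points with dist(vp, p) > radius

class IncrementalVPTree:
    """Hypothetical sketch: vantage-point tree with insert-at-leaf and bucket splits."""
    def __init__(self, dist: Metric, leaf_size: int = 8) -> None:
        self.dist = dist
        self.leaf_size = max(1, leaf_size)
        self.root = _Node()

    def insert(self, item: Any) -> None:
        node = self.root
        while node.bucket is None:  # descend to the leaf that should own the item
            near = self.dist(item, node.vp) <= node.radius
            node = node.inside if near else node.outside
        node.bucket.append(item)
        if len(node.bucket) > self.leaf_size:
            self._split(node)

    def _split(self, node: _Node) -> None:
        vp, rest = node.bucket[0], node.bucket[1:]
        dists = [self.dist(vp, p) for p in rest]
        radius = sorted(dists)[(len(dists) - 1) // 2]  # lower median distance
        inside = [p for p, d in zip(rest, dists) if d <= radius]
        outside = [p for p, d in zip(rest, dists) if d > radius]
        if not outside:  # ties prevent a useful split; keep an oversized leaf
            return
        node.bucket, node.vp, node.radius = None, vp, radius
        node.inside, node.outside = _Node(), _Node()
        node.inside.bucket, node.outside.bucket = inside, outside

    def knn(self, query: Any, k: int) -> List[Tuple[float, Any]]:
        if k <= 0:
            return []
        heap: List[Tuple[float, int, Any]] = []  # max-heap on distance via negation
        tie = itertools.count()                  # tie-breaker so items are never compared

        def consider(item: Any, d: float) -> None:
            if len(heap) < k:
                heapq.heappush(heap, (-d, next(tie), item))
            elif d < -heap[0][0]:
                heapq.heapreplace(heap, (-d, next(tie), item))

        def tau() -> float:  # distance of the current k-th best candidate
            return -heap[0][0] if len(heap) == k else math.inf

        def search(node: _Node) -> None:
            if node.bucket is not None:
                for p in node.bucket:
                    consider(p, self.dist(query, p))
                return
            d = self.dist(query, node.vp)
            consider(node.vp, d)
            if d <= node.radius:
                search(node.inside)
                if node.radius - d < tau():   # outside points lie farther than radius - d
                    search(node.outside)
            else:
                search(node.outside)
                if d - node.radius <= tau():  # inside points lie at least d - radius away
                    search(node.inside)

        search(self.root)
        return sorted(((-nd, item) for nd, _, item in heap), key=lambda t: t[0])

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

# Interleave insertions and queries, as in the deployment scenario described above.
tree = IncrementalVPTree(manhattan, leaf_size=2)
for point in [(0, 0), (5, 5), (1, 2), (9, 1)]:
    tree.insert(point)
nearest = tree.knn((1, 1), k=2)
tree.insert((1, 1))
nearest_after = tree.knn((1, 1), k=2)
```

Because new points are only appended to leaves, a tree built this way can become unbalanced as data arrives, which is one reason an index designed for incremental use may behave differently from one built in a single batch.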
Journal:
- CoRR
Volume: abs/1801.05055
Publication date: 2018